CLUSTER ANALYSIS OF OCCUPATIONAL DATA
WITH FOCUS ON TASK RATHER THAN PEOPLE


A "job" may be defined as a grouping of tasks performed by an
individual to accomplish some purpose within an organization. Usually,
the tasks which make up a job have a meaningful relationship with one
another. That is, they might involve similar skills and requirements
or they might be related by environmental factors, such as physical and
temporal proximity. The job analyst is concerned with identifying and
systematically recording the behaviors performed by job incumbents.

From the collection and analysis of these data by the job analyst,
inferences can be drawn and useful recommendations made regarding such
matters as personnel selection policies, training programs, and the
planning of manning tables and force studies. One of the important needs
of the job analyst for accomplishing these ends is a method of grouping
tasks into meaningful and useful clusters. One such method is cluster
analysis, i.e., a technique by which entities are formed into relatively
homogeneous groups, based on similarity measures. The usual procedure in
such analysis is to measure a number of attributes of the entities and,
by pairwise comparisons of the entities (and/or subclusters of the
entities), form clusters based on the similarity of their respective
attributes.

When applied in occupational analysis, these techniques can cluster
individuals (entities) on the basis of the tasks (attributes) they
perform. The results of this process are clusters of people who perform
similar jobs.

OBVERSE  CLUSTER  ANALYSIS 

Obverse cluster analysis is a modification of the usual clustering
procedures so that clusters of tasks are constructed on the basis of
the individuals who perform them. The task measurements used are the same
as in a traditional method of clustering. However, in obverse clustering
a task is clustered with another task depending upon how many, or few,
individuals perform both tasks.

COMPARISON  OF  TWO  METHODS 

A  simple  comparison  of  the  two  methods  is  provided  by  analyzing  the 
illustrative  data  in  Table  1. 


Bailey, D. E., and Tryon, R. C. Cluster Analysis. McGraw-Hill, 1970.

Bottenberg, R. A., and Christal, R. E. An iterative technique for
clustering criteria which retains optimum predictive efficiency.
The Journal of Experimental Education, 36 (4), Summer 1968, 28-34.


Table 1

ILLUSTRATIVE EXAMPLE OF DATA ON TASKS AND INDIVIDUALS

                        Tasks
Individuals     1    2    3    4    5

     A          1    1    1    0    0
     B          0    0    0    1    1
     C          1    1    1    0    1
     D          0    0    0    1    1
     E          1    0    1    0    0

Assume that five individuals (A, B, C, D, E) have been asked if they
perform each of five tasks (1, 2, 3, 4, 5). In Table 1, the answers are
recorded as: 1 = do perform the task, 0 = do not perform the task.

Conceptually, clustering may proceed in stages from a first stage where
each individual is a cluster of one to a final stage where the total
group of individuals are clustered together, with successive intermediate
stages which are determined by the relative pairwise similarity of
individuals (and/or clusters) to one another.

CLUSTERING  INDIVIDUALS 

If one wants to cluster the individuals of the example above
(a common clustering objective in personnel research), the similarity
measure one would use is the number of tasks individuals perform in
common. Thus, in the first stage individuals A and C would be clustered
together (C1), because they perform three tasks in common. In the next
stage individual E would be added to C1 because he is more similar to
the members of this cluster than to either of the other individuals.
Finally, individuals B and D would join together to form cluster C2,
because they are more similar to one another than to members of C1. For
job analysts some grouping intermediate between single individuals and
the total group may be useful for identifying individuals who perform
similar jobs.
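
The count of tasks performed in common can be tabulated directly from
Table 1. The Python sketch below is illustrative only; the names
TABLE_1, tasks_in_common, and pairwise_common_tasks are invented here
and do not belong to any program described in this paper.

```python
from itertools import combinations

# Table 1: one row per individual, one column per task 1-5
# (1 = performs the task, 0 = does not).
TABLE_1 = {
    "A": [1, 1, 1, 0, 0],
    "B": [0, 0, 0, 1, 1],
    "C": [1, 1, 1, 0, 1],
    "D": [0, 0, 0, 1, 1],
    "E": [1, 0, 1, 0, 0],
}


def tasks_in_common(row_x, row_y):
    """Count the tasks that both individuals report performing."""
    return sum(1 for x, y in zip(row_x, row_y) if x and y)


def pairwise_common_tasks(table):
    """Return {(person, person): number of shared tasks} for every pair."""
    return {
        (p, q): tasks_in_common(table[p], table[q])
        for p, q in combinations(sorted(table), 2)
    }


common = pairwise_common_tasks(TABLE_1)
most_similar = max(common, key=common.get)  # pair with the most shared tasks
```

Here A and C share three tasks, the largest count in the table, while
B and D share two and have nothing in common with A or E.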

OBVERSE CLUSTER ANALYSIS

To perform an obverse cluster analysis of the data in Table 1, the
measure of similarity among tasks would be the number of individuals who
perform both tasks of a pair. For example, tasks 1 and 3 are both
performed by three individuals (A, C, E), so they would form the first
obverse cluster (OC1) in the matrix. In the next stage task 2 would be
added to OC1 because it is performed by two individuals (A and C) who
perform the member tasks in OC1. Finally, tasks 4 and 5 would be
clustered together to form a cluster (OC2), because they are more similar
to one another than to the tasks in OC1.

This  example  illustrates  that  the  same  matrix  can  be  employed  to 
cluster  either  individuals  or  tasks.  It  should  be  noted  that  the  matrix 
usually  is  rectangular,  i.e.,  the  number  of  individuals  need  not  be  the 
same  as  the  number  of  tasks. 
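
The obverse view amounts to reading Table 1 by columns instead of by
rows. The following sketch shows one way to do that; the helper
performers_by_task and the variable names are assumptions made for
illustration, not part of any program described here.

```python
from itertools import combinations

PEOPLE = ["A", "B", "C", "D", "E"]
TASKS = [1, 2, 3, 4, 5]
# Table 1: one row per individual, one column per task.
RESPONSES = [
    [1, 1, 1, 0, 0],  # A
    [0, 0, 0, 1, 1],  # B
    [1, 1, 1, 0, 1],  # C
    [0, 0, 0, 1, 1],  # D
    [1, 0, 1, 0, 0],  # E
]


def performers_by_task(responses, people, tasks):
    """Read the matrix column by column: task -> set of its performers."""
    columns = zip(*responses)  # one tuple of answers per task
    return {
        task: {person for person, answer in zip(people, column) if answer}
        for task, column in zip(tasks, columns)
    }


performers = performers_by_task(RESPONSES, PEOPLE, TASKS)
# Individuals who perform both tasks of each pair.
shared = {
    (s, t): performers[s] & performers[t] for s, t in combinations(TASKS, 2)
}
closest_pair = max(shared, key=lambda pair: len(shared[pair]))
```

The same five-by-five matrix serves both analyses; only the direction
in which it is read changes.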
THE CODAP SYSTEM


The Department of Defense is supporting a set of computer programs
for occupational analysis called the Computerized Occupational Data
Analysis Programs (CODAP) system. The CODAP system was developed by
the Air Force and is being used by the Navy and Marine Corps. One part
of the CODAP system is a clustering program to identify individuals or
groups of individuals who perform similar jobs. Two measures of job
similarity may be used as a basis for clustering: either the percentage
of time that each individual devotes to each task he performs or the
list of tasks he performs on the job.
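
The percent-time basis for similarity can be illustrated with a short
sketch. The formula used here, which sums over all tasks the smaller of
two individuals' percent-time values, is an assumption made for
illustration; this paper does not state how CODAP computes overlap from
percent time. The function name and the data are hypothetical.

```python
def time_overlap(percent_time_x, percent_time_y):
    """Assumed overlap: sum over tasks of the smaller percent-time value.

    A task that only one individual performs contributes nothing.
    """
    tasks = set(percent_time_x) | set(percent_time_y)
    return sum(
        min(percent_time_x.get(task, 0.0), percent_time_y.get(task, 0.0))
        for task in tasks
    )


# Hypothetical incumbents: percent of working time reported for each task.
incumbent_x = {"task 1": 50.0, "task 2": 30.0, "task 3": 20.0}
incumbent_y = {"task 1": 40.0, "task 3": 60.0}

overlap = time_overlap(incumbent_x, incumbent_y)  # 40 + 0 + 20
```

Under the task-list basis the same two incumbents would instead be
compared only on which tasks appear in both lists, regardless of the
time spent on each.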


THE CODAP PROCEDURE FOR CLUSTERING INDIVIDUALS

To accomplish the clustering of individuals, the program casts the
N individuals into an N x N table called a "similarity matrix" where the
entries are the percentage of overlap in the tasks performed by each
individual with each of the other individuals. The program then scans
the similarity matrix and locates the overlap value of the two
individuals whose task inventories indicate the greatest percentage of
overlap in their jobs and combines them into a "cluster." The number of
overlap values in the similarity matrix is thereby reduced to
(N-1) x (N-1). In the next stage the program clusters the two most
similar of the remaining N-1 entities (i.e., N-2 individuals and one
cluster) in the matrix by using the criterion of the entry with the
greatest percentage of overlap. This may involve either adding an
individual to the existing cluster or uniting two individuals to form a
new cluster. Either union will reduce the number of entities to N-2 and
the similarity matrix will be collapsed to (N-2) x (N-2). The process
continues in successive stages by combining the pair with the most
similar sets of tasks performed (i.e., the greatest percentage of overlap
in the similarity matrix) until the set of N individuals is expressed as
a small group of clusters and the similarity matrix is reduced
appropriately.
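
The staged procedure can be sketched in a few lines of Python. This is
not the CODAP program: the overlap formula (shared tasks as a percentage
of the tasks performed by either individual) and the rule for comparing
clusters (the mean overlap across all cross-cluster pairs of members)
are both assumptions chosen for illustration, and all names are
hypothetical. The data are the task lists of Table 1.

```python
from itertools import combinations

# Task lists from Table 1.
TASK_SETS = {
    "A": {1, 2, 3},
    "B": {4, 5},
    "C": {1, 2, 3, 5},
    "D": {4, 5},
    "E": {1, 3},
}


def percent_overlap(tasks_x, tasks_y):
    """Assumed measure: shared tasks as a percent of tasks done by either."""
    union = tasks_x | tasks_y
    return 100.0 * len(tasks_x & tasks_y) / len(union) if union else 0.0


def cluster_overlap(cluster_x, cluster_y, task_sets):
    """Assumed rule: mean overlap over all cross-cluster pairs of members."""
    pairs = [(p, q) for p in cluster_x for q in cluster_y]
    total = sum(percent_overlap(task_sets[p], task_sets[q]) for p, q in pairs)
    return total / len(pairs)


def agglomerate(task_sets):
    """Merge the two most similar entities per stage until one remains.

    Returns a list of (members of new cluster, overlap at the merge).
    """
    clusters = [(name,) for name in sorted(task_sets)]
    stages = []
    while len(clusters) > 1:
        x, y = max(
            combinations(clusters, 2),
            key=lambda pair: cluster_overlap(pair[0], pair[1], task_sets),
        )
        merged = tuple(sorted(x + y))
        stages.append((merged, cluster_overlap(x, y, task_sets)))
        # Each merge reduces the number of entities by one.
        clusters = [c for c in clusters if c not in (x, y)] + [merged]
    return stages


stages = agglomerate(TASK_SETS)
```

With this assumed percentage measure, B and D merge first because their
task lists are identical (100 percent), A and C merge next (75 percent),
and E then joins A and C. The order differs from the count-based
illustration given for Table 1, but the two resulting groups are the
same.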

After the clustering process is completed, the job analyst is provided
with a printout which indicates both the percentage of overlap between the
two groups clustered at each stage in the process and the average
percentage of overlap among all members within the newly formed cluster.
From the latter measure the analyst is able to determine at what stage
in the clustering process he wants to examine the groups which have been
formed. That is, he will identify the clusters which have the degree of
individual similarity among members useful for the analyst's purpose.
The analyst can also examine a printout which assigns a number to every
individual in such a way that all individuals clustered at any stage are
listed together and may be found within a certain sequence range.


Christal, R. E., and Ward, J. H., Jr. The MAXOF clustering model.
In Proceedings: Conference on cluster analysis of multivariate data,
New Orleans, December 1966. Washington: Office of Naval Research.


ARI OBVERSE CLUSTERING, AN ADDITION TO THE
CODAP SYSTEM


The Army Research Institute (ARI) has designed a system which uses
the same input as the CODAP system with tasks in an N x N similarity
matrix, but which locates the tasks which the greatest number of
individuals perform in common and combines them into a cluster. The
task similarity matrix is then reduced to (N-1) x (N-1). In the next
stage the program clusters the two most similar of the remaining N-1
tasks and/or subclusters based on the number of individuals who perform
them. Analogous to the clustering of individuals described previously,
obverse clustering involves adding a task to the cluster or uniting two
tasks to form a new cluster, thereby reducing the similarity matrix
to (N-2) x (N-2) entries for N-2 tasks and/or clusters. The process
continues until the entire set of N tasks is included in a single
cluster containing all tasks.

In addition to clustering tasks, the ARI addition to the CODAP system
outputs an ordered list of tasks which reflects the content of the
clusters and provides information about hierarchies of task clusters
(i.e., subclusters within clusters, etc.).
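
The staged merging of tasks and the ordered task list can be sketched as
follows. This is not the ARI program. The similarity between a task and
a cluster is assumed here to be the number of individuals who perform
every task involved, which is consistent with the Table 1 illustration
but is not stated as the program's rule; the way the ordered list is
built (member tasks kept adjacent as clusters merge) is likewise an
assumption. All names are hypothetical.

```python
from itertools import combinations

# Performers of each task, read from the columns of Table 1.
PERFORMERS = {
    1: {"A", "C", "E"},
    2: {"A", "C"},
    3: {"A", "C", "E"},
    4: {"B", "D"},
    5: {"B", "C", "D"},
}


def shared_performers(cluster_x, cluster_y, performers):
    """Assumed similarity: individuals who perform every task in both."""
    groups = [performers[task] for task in cluster_x + cluster_y]
    return len(set.intersection(*groups))


def obverse_stages(performers):
    """Merge the most similar pair of tasks/clusters at each stage.

    Returns the merge history and the final ordered task list. Ties go
    to the first pair encountered, which is an arbitrary choice.
    """
    clusters = [(task,) for task in sorted(performers)]
    history = []
    while len(clusters) > 1:
        x, y = max(
            combinations(clusters, 2),
            key=lambda pair: shared_performers(pair[0], pair[1], performers),
        )
        merged = x + y  # concatenation keeps member tasks adjacent
        history.append((merged, shared_performers(x, y, performers)))
        clusters = [c for c in clusters if c not in (x, y)] + [merged]
    return history, list(clusters[0])


history, ordered_tasks = obverse_stages(PERFORMERS)
```

For the Table 1 data the stages are tasks 1 and 3 (three shared
performers), then task 2 joining them (two), then tasks 4 and 5 (two).
In the resulting ordered list every cluster formed at any stage occupies
a contiguous range, so subclusters can be read off as sub-ranges.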


AN IMMEDIATE USE FOR OBVERSE CLUSTERING

An immediate application for obverse clustering was provided by the
development of duty modules as part of a contract with the American
Institutes for Research (AIR), on "A Taxonomic Base for Future Management
Information and Decision Systems." Duty modules are groups of tasks
that tend to "go together" in meaningful ways and which satisfy certain
operational requirements of utility. The obverse clustering of tasks
performed by incumbents within an MOS was used as a comparison with duty
modules which had been independently developed from job descriptions
and expert judgments.

Preliminary data from task inventories administered by the Army
Office of Personnel Operations to incumbents of MOS 11D (Armor
Reconnaissance Specialist) and contained in the Military Occupational
Data Bank (MODB) were used in a tryout of obverse clustering. Table 2
summarizes the correspondence between some of the task clusters
empirically identified by CODAP obverse analysis for 11D incumbents and
"duty modules" for the same MOS developed by the expert judgment of an
AIR team of recently retired Army personnel.


Stephenson, R. W. A taxonomic base for future management information
and decision systems: A common language for resource and requirement
planning. ARI Technical Research Note, October [year illegible].
Table 2

PERCENT OF TASKS IN EMPIRICALLY IDENTIFIED CODAP CLUSTERS
THAT FALL INTO DERIVED DUTY MODULES

CODAP Obverse     Number of Tasks     DUTY MODULES
Cluster Number    in Cluster          ADMINISTRATION (A)    TRAINING (B)

[Table entries missing in source.]


The rows in Table 2 correspond to the CODAP task clusters based on
the frequency with which the tasks are assigned to the same people in
the field. That is, insofar as the lists of tasks are sufficiently
comprehensive to cover all duties, these clusters represent one concept
of the "real world" of actual assignment practices as reported by job
incumbents. The columns in Table 2 represent some of the duty modules
derived by experienced judgment for the same MOS. The entries in Table 2
indicate what percentage of the tasks in each of the empirically
identified (i.e., CODAP obverse method) clusters are also contained in
duty modules A-1 through B-2. Examples of the tasks included in duty
modules A-1 through B-2 are provided in Table 3.
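
The computation behind each entry of such a table can be sketched as
follows. The task identifiers and module memberships below are
hypothetical and are not taken from the 11D data; the function name
percent_in_modules is likewise invented for illustration.

```python
def percent_in_modules(cluster_tasks, duty_modules):
    """Percent of a cluster's tasks that also appear in each duty module."""
    return {
        module: 100.0 * len(cluster_tasks & module_tasks) / len(cluster_tasks)
        for module, module_tasks in duty_modules.items()
    }


# Hypothetical task identifiers; none of these memberships come from the study.
DUTY_MODULES = {
    "A-1": {"t01", "t02", "t03"},
    "A-2": {"t04", "t05"},
    "B-1": {"t06", "t07", "t08"},
}
# One hypothetical obverse cluster; t09 belongs to no listed duty module.
OBVERSE_CLUSTER = {"t01", "t02", "t04", "t09"}

row = percent_in_modules(OBVERSE_CLUSTER, DUTY_MODULES)
```

Each obverse cluster yields one row of percentages. Because a task may
fall outside every listed module, a row need not sum to 100.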


Table 3

EXAMPLES OF THE TASKS INCLUDED IN DUTY MODULES A-1 THROUGH B-2

Duty
Module   Title                               Task Example

A-1      Performs general administration     Prepare unit morning report
         at company level headquarters

A-2      Performs unit supervision and       Schedule leaves and passes
         control of personnel

A-3      Establishes and operates a          Receive and distribute
         unit mail room                      personal mail

A-4      Types, files and performs           Cut stencils and ditto
         general clerical operations         masters

B-1      Conducts unit and individual        Prepare lesson plans and
         training                            training aids

B-2      Supervises and coordinates          Evaluate personnel and
         training in the unit                recommend training

If the percentage agreement in Table 2 is high, one may conclude that
duty modules are comparable to the actual current assignment of duties to
individuals in the field. This conclusion is justified because the
probability of tasks being clustered together is based on the frequency
with which they are performed by the same people. It should be pointed
out that the percentages in Table 2 are strongly affected by the
following conditions:

1. The task list for a given MOS may or may not adequately sample
the tasks performed by an incumbent.

2. The task list for a given MOS may or may not include all the tasks
which make up the duty modules for that MOS.

3. Duty assignments may exist which are inappropriate to a duty
position designation within the MOS; this will tend to reduce the
apparent fit of empirically derived clusters with duty modules (i.e.,
assignment practices are not necessarily perfect and may reflect a
variety of contingencies).

4. The actual assignments reported in the questionnaires may not
have included important missions that might be critical in combat but
not encountered by most soldiers during the period covered by the
questionnaires.


In summary, for the data available to date, it appears that rationally
derived duty modules do overlap to a moderate degree with empirical task
clusters derived by the CODAP obverse method, even when conditions may
serve to mitigate this correspondence.


OTHER APPLICATIONS

Other possible applications of obverse clustering include identifying
unit and individual performance criteria based on actual duties performed
and forecasting equipment and maintenance needs based on measures of use
by personnel.


